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Abstract 

We study the propagation of very large amplitude localized excita- 
tions in a model of DNA that takes explicitly into account the helicoidal 
structure. These excitations represent the "transcription bubble", where 
the hydrogen bonds between complementary bases are disrupted, allowing 
access to the genetic code. We propose these kind of excitations in alter- 
native to kinks and breathers. The model has been introduced by Barbi 
et al. [Phys. Lett. A 253, 358 (1999)], and up to now it has been used 
to study on the one hand low amplitude breather solutions, and on the 
other hand the DNA melting transition. We extend the model to include 
the case of heterogeneous chains, in order to get closer to a description 
of real DNA; in fact, the Morse potential representing the interaction be- 
tween complementary bases has two possible depths, one for A-T and one 
for G-C base pairs. We first compute the equilibrium configurations of a 
chain with a degree of uncoiling, and we find that a static bubble is among 
them; then we show, by molecular dynamics simulations, that these bub- 
bles, once generated, can move along the chain. We find that also in the 
most unfavourable case, that of a heterogeneous DNA in the presence of 
thermal noise, the excitation can travel for well more 1000 base pairs. 



1 INTRODUCTION 

In the attempt to describe some aspects of DNA fmictioning, the theory of non- 
linear dynamical systems has found an interesting application to this important 
biological structure. In spite of the awareness of the complexity that character- 
izes most of the dynamical processes taking place in biological systems, where 
very often the necessary trigger is constituted by the temporary interaction of 
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different objects, there is nevertheless an effort, in the research in biological 
physics, to grasp some essential features with simple models. 

In this work we are interested in the propagation of very large amplitude 
bubbles, that should be very important in the description of the process of 
transcription. In this respect, the main appeal of onedimensional nonlinear 
models is their possibility to sustain localized excitations, of which our bubbles 
are an example. In this direction there have been works based on some models 
where the essential degree of freedom of each site of the chain is related to the 
opening of the hydrogen bonds between the complementary bases of the double 
stranded DNA (for a review see, e.g., |^ and references therein). 

The important points to address for any given model are the existence and 
the stability of localized excitations, both stationary and moving; after that the 
problem of their formation has to be considered. In this Section we give a very 
brief account of the situation in models with one degree of freedom per base 
pair, limiting ourselves to the problems of existence and stability. 

Let us begin with homogeneous chains. We can start making a distinction 
between models that do not neglect the discreteness of the system, and mod- 
els that are treated in the continuum limit, either from the start or after the 
approximation of the original discrete equations. In both cases there is usu- 
ally a nearest neighbor interaction and a site potential; in the continuum limit 
generally the field equation under study is of the Klein-Gordon type: 

Consider the problem of the existence of localized stationary solutions. We 
know that for equations like (Q) localized solutions, where at both sides of the 
excitation the field (j> is at the minimum of U, are possible only if U has degen- 
erate minima; then (j> has a kink configuration and we have a topological soliton, 
which implies a displacement of a whole side of the chain. One of the most used 
example of (|l|) is the sine-Gordon equation. The stationary solutions really are 
static, i.e., they are equilibrium configurations, that are obtained by solving 
the Newton-like equation I'of^ = and localized solutions, the kinks, are 
found by choosing appropriate "initial conditions" in this equation. If the origi- 
nal model equations were discrete, the equilibrium configurations are such that 
their envelope has a kink structure which is very close to that of the continuum 
equation, with the center exactly on one site (odd kink), or in the middle be- 
tween two sites (even kinks) (see j^, where also the problem of the stability is 
considered). 

In models where the discreteness is taken into account, the problem of so- 
lutions with a topological index can be circumvented; in fact, also with site 
potentials U with a nondegenerate absolute minimum, it is possible to have sta- 
tionary localized excitations, in the form of breathers, in which only few sites 
are coherently involved with a not negligible amplitude in a nonlinear oscilla- 
tion with a given fundamental frequency (for the mathematics of breathers in 
discrete nonlinear lattices and the conditions for their existence and stability 
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see @ |,|). Now there is an additional degree of freedom in the choice of the 
locahzed solution, namely the variation, in proper ranges, of the fundamental 
frequency. In analogy with the situation that arises with kinks, a breather with 
a given fundamental frequency can be centered on one site, which has the largest 
amplitude of oscillation (odd breather), or on two sites, which have the largest 
identical amplitudes (even breather). 

We now turn our attention to the stability of these solutions. The stability of 
a static kink configuration, in lattice models approximated by (||), requires the 
positive definiteness of the hessian matrix of the total potential energy. Instead 
for a breather the stability is checked through the linearization of the equations 
of motion around the stationary solution; the transformation of the perturbation 
in one breather period gives a linear application, whose spectrum determines 
the stability properties: the breather is stable if there are no eigenvalues with 
modulus greater than 1 Q . 

The important point to study is that of the moving capabilities of localized 
excitations. If we are interested in moving solutions of Eq. (||) , the problem is 
easily solved. Any static solution, in particular a localized one, can be trans- 
formed to a moving one, with speed v < vq, through a Lorentz boost (with 
"speed of light" wq); its profile will only be modified by the Lorentz contraction. 
If the original model is discrete and Eq. (|l|) is only its approximation, then 
the movement of the localized excitation will be associated to an energy loss 
through phonon radiation but generally the kink retains a good stability. In |Q 
one can find the treatment of this phonon dressing of the moving kink. 

The study of moving breathers is more difficult. While the existence of 
stationary breathers seems to be quite independent of the characteristics of the 
site potential, at least for sufficiently small coupling between different sites 
the presence of exact moving breathers is associated to some special integrable 
hamiltonians (see |^ for a review). In the other, generic cases, the common 
approach is to study the stability spectrum of stationary breathers, making a 
connection between some of the elements of the spectrum and the "sensitivity" 
to movement ^. The absence of a zeroth order moving solution, analogous 
to the boosted kink of Eq. (|l|), makes the treatment of stability of moving 
breathers less approachable. 

If we consider heterogeneous chains, there are of course more problems, espe- 
cially regarding the stability of moving excitations, in particular breathers. The 
amount of work has been less extensive. There have been studies on nonlinear 
localization in chains with impurities ||l0|| , on kink propagation through regions 
with mass variation MD simulations in a sine- Gordon model of DNA [p!^ ; 
see for a review of results on disordered systems in the continuum limit. 

Let us summarize what are the main problems with kinks and breathers if we 
want to consider them as good candidates for the local openings needed during 
the transcriptions. As we have seen, the stability is a less severe difficulty for 
kinks. For this reason they could be preferable. On the contrary, the breathers 
have the advantage to avoid the problem of the topological index. However, it 
is plausible that, if we want to describe with a breather a region where the bond 
between the complementary bases is temporarily broken, to allow access to the 
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genetic code, this breather must be wide (i.e., the number of sites essentially 
involved in the motion is not very small) and of very large amplitude; unfortu- 
nately, the probability that a breather is stable greatly decreases when its width 
increases and when its amplitude is large |^ (this second point can be easily 
guessed, since for potentials that allow dissociation, large amplitude means low 
fundamental frequency, and therefore a strong probability of resonance with the 
phonon dispersion curve). 

In this work it is our purpose to show that, with a given model with two 
degrees of freedom per base pair, it is possible to put together some of the 
advantages of kinks and breathers, i.e., respectively, a good stability with respect 
to movement, and a local excitation that does not need a topological index. We 
will show that these large amplitude solutions of the model have a satisfying 
stability also with heterogeneity and with thermal noise. We think that these 
properties can represent those of a "transcription bubble" . 

The model has been proposed by Barbi et al. Q, and it is an evolution, 
that takes the helicoidal structure explicitly into account, of the Peyrard-Bishop 
model. This last model was introduced Jist to have a dynamical explanation 
of the melting transition, opposed to methods that offer only equilibrium esti- 
mates of the probability of bond disruption |l^ . A satisfactory melting 
curve was obtained and later the melting of heterogeneous chains and of 
heterogeneous oligonucleotides has been studied |2^, |2|, |2^, . The helicoidal 
model introduced in was there used to build approximate low amplitude 
solutions through the method of the multiscale expansion , and in the 
melting transition was investigated. 

In Section II we introduce the model and we compute its equilibrium con- 
figurations, with their stability properties. In Section III we show the results of 
our MD simulations, together with an approximate computation of the features 
of the moving bubbles. In Section IV we present our discussion and draw some 
conclusions. 



Our starting point is the model introduced (in a somewhat different version) in 
The bases can move only in planes perpendicular to the helix axis; besides, 
the center of mass of the base pair is held fixed, and the two complementary 
bases move simmetrically with respect to the axis of the molecule. Then for 
each base pair there are two degrees of freedom: r„ is the distance between each 
one of the complementary bases in the n-th base pair and the helix axis; 0„ is 
the angle that the line joining the two complementary bases makes with a given 
direction in the planes where the bases move. The potential energy is given by: 



2 THE MODEL 



u 




(2) 



4 



where Ln+i.n is the distance between neighbor bases on the same strand, and 
as a function of r„, r„+i and 9n+i — 6n = is given by: 



Ln+i,n = yh'^ + rl^i + rl - 2r„+ir„ cos^ A6'„ (3) 

where h = 3.4 A is the fixed distance between neighbor base planes and 
i?o = 10 A is the equilibrium value of r„; Lg is the same function computed for 
r„_|_i = r„ — Rq and A0„ — Qq = 7r/5 (10 base pairs per helix turn). Therefore 
the equilibrium configuration is that with r„ ~ Rq and A0„ = Gg for each n, 
which gives the system its helicoidal structure. The natural helicity is right 
handed, and for convenience we take this as the positive rotation for the angles 
On- Actually, equilibrium configurations are also all those in which any A9n 
is chosen indifferently ±0g; however, they are separated by the fundamental 
one by potential barriers, that in our simulations are never crossed. The first 
two terms in are the same of the original PB model [^, which has r„ as 
the only degree of freedom per base pair: there is a Morse potential with Rq 
as the equilibrium distance, and a harmonic interaction between neighbor base 
pairs (stacking interaction); there can be two different values for _D„, Da-t 
for A-T base pairs and Dq-c for G-C base pairs. It is generally assumed that 
Dg-c — ^Da-t- The last term in (^ describes a restoring force that acts 
when the distance L between neighbor bases on the same strand is different 
from Lg. The model has been introduced in without the second term in 
(^, essentially attributing all the stacking interaction to the last term, and with 
an additional three-body term proportional to {9n+i + 9n-i — "^On)^ , to elimi- 
nate the equilibrium configurations with some A0„ = — Og. In this form the 
authors have studied small amplitude breather- like solutions, with the envelope 
described by the nonlinear Schrodinger equation. In |^ the statistical mechan- 
ics of model (|^) has been studied, the authors being interested in the melting 
transition of DNA; in this case the difference from (^) was given by a replace- 
ment of the coupling constant c by a coupling of the form ce^'''--^"+^^^"~^^''\ 
decreasing with Vn+i + Vn increasing. Besides, the restoring force represented 
by the last term was cast in another form, with L fixed (= Lg) and h variable. 
In both works a homogeneous DNA (I?„ constant in n) was considered. We 
use the more complete structure used in , with the second term in (H) more 
related to the stacking interaction, and the last term more related to the rigidity 
of the two single strands, with L variable and h fixed. For simplicity, we do not 
insert a decaying coupling: this last feature has been found, already in the orig- 
inal PB model |l^, as being essentially responsible for sharpening the melting 
transition, which happens at higher temperatures than those we are interested 
in here. Also, the three-body term used in [ |T^ was not necessary, since in our 
simulations, as we have already pointed out, we never had a crossover in the 
sign of A0„. 

The spirit in which we study the model represented by (|^) is the following. 
We consider a chain of a given length, with fixed boundary conditions. We 
know that an interaction with an enzymatic complex (with RNA polymerase) 
is necessary to trigger the process of transcription. We take here, as a working 
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hypothesis, that this external action can be represented by a partial unwinding 
of one of the extremes of our chain, considered as the interaction site. We will 
show that this mechanism can give rise to a travelling bubble (the "transcription 
bubble"), in which several base pairs are open; this bubble, that appears to be 
very stable in a homogeneous chain initially at rest, is interestingly long lived 
also in the case of a heterogeneous chain and with thermal disorder. 

In the remaining of this Section our aim is to give an analytical background 
to the results, obtained by molecular dynamics simulations, that will be pre- 
sented in the next Section. Since the essential dynamical process that we will 
observe is a local opening travelling along the chain, we want to show that this 
movement can be considered, in an adiabatic approximation, as a succession 
of equilibrium configurations, similarly to what happens with the travelling of 
kinks. Consequently, in the first subsection we will study the equilibrium con- 
figurations of our system: we will show how a chain with some uncoiling, caused 
by suitable boundary conditions (i.e., if 9o and On+i are held fixed at values 
such that 9n+i -eo = E„=o < (^^ + 1)60), can have different equilibrium 
configurations. The simplest one, for a homogeneous chain, is given by a ho- 
mogeneous configuration r„ = r and A0„ = AO for each n, for certain values 
of r and A6; for a heterogeneous chain, where Z)„ = Da~t for some n and 
Dn = Dg-c for the other values of n, the configuration is not qualitatively 
very different from the previous, although the precise equilibrium values of r„ 
and AOn depend on the sequence. Another equilibrium configuration in the 
homogeneous case, the one in which we are interested, is one in which a small 
region (about 20 base pairs) is completely open, and at both sides r„ and A0„ 
decay rapidly to homogeneous values. Again, in the heterogeneous case, the de- 
pendence on the sequence does not alter qualitatively the picture. In the second 
subsection we will briefly treat the stability of these equilibrium configurations. 

2.1 Equilibrium configurations 

If we neglect the mass variation between A-T and G-C base pairs, and take a 
proper unit of mass, the equations of motion deriving from (^) are: 

2r„f„6'„ = (4) 



r. 



"n^n T' ^1 n' n^n 



The equilibrium configurations are those that make the right hand sides vanish. 
Therefore we have to solve: 

2aL»„ |^e-2a(r„-fl,o) _ g-a(r-„-%)^ ^ ^^^^^ 

, I Ln+l.n — Lq . , 

< 7 [rn+1 COS AOn - rn\ 

I -t^n+lm 

Ln,n-1 — Lq 



[r„_i COS A6'„_i - r„] > = (5) 
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for n = 1, . . . , TV; here A2r„ = r„+i + r„_i — 2r„. From the structure of Eq. 
(||) it is clear that any solution of (||) and (||) has to be such that the quantity 
represented by, say, the first term in ^) is a constant as a function of n. Let us 
begin considering a homogeneous chain. Then of course the simplest solution is 
to have both r„ and A0„ constant in n. In this case also Ln+i.n is constant in 
n. Therefore, posing Vn — r and A6'„ — A6', and equating Ln+i.n to a constant 
L, one can express A6 as a function of r and L; then, substituting in (||) this 
function, together with L„_|_i_„ = L and r„ = r, one can find (numerically) the 
equilibrium value r (which will depend on L). Going back also the equilibrium 
value A9 can be computed. If the chain is not infinite and there are fixed 
boundary conditions (for n = and n = N + 1), then, for the equilibrium it is 
required that also vq = rjv+i = r and AOq = A9n = A9. It is clear that the 
fundamental equilibrium configuration (r = Rq and A6 — Qq) is obtained for 
L = Lq. For L < Lq we have r > Rq and A9 < Qq (uncoiling), while for L > Lq 
we have r < Rq and A6 > 8o (overcoiling) . When the chain is heterogeneous, 
the corresponding equilibrium solution can be found from the homogeneous one 
with an iterative procedure explained in Appendix A. The solution will depend 
on the sequence of the I?„s; however, it will not be qualitatively very different 
from the homogeneous case. The interesting equilibrium configuration is of 
course that in which we have an open region. In this case, although the first 
term in (^ is constant in n (and equal, say, to s), in+i^n is not itself a constant. 
We have developed a procedure to compute these configurations. Here we only 
give a sketch; more details can be found in Appendix B. In principle, from 

(in+l,n - Lo)r„ + irn siu A9n = sLn+l,n (7) 

it is possible to obtain A9n as a function of r„+i, r„ and s; substituting in 
, we can therefore obtain r„_|_i as a function of r„, r„_i and s; in this way, 
starting from the values for two contiguous r„ , we can compute site by site the 
equilibrium configuration. With a proper choice of the value of s, we obtain 
a solution in which there is a region, of about 20 base pairs, where > Rq 
in such a way to stay in the plateau of the Morse potential; in that region the 
uncoihng (A0„ < 60) is marked. At both sides of the open region the r„s decay 
very rapidly to a value slightly larger than Rq and the A6'„s to a value slightly 
smaller than Qq. As before, after the computation has been performed for a 
homogeneous chain, with the procedure explained in Appendix A we can obtain 
also the configuration for a heterogeneous chain. The two cases do not differ 
qualitatively. In Fig. 1 we show two examples of equilibrium configurations: 
one for a homogeneous chain and another for a heterogeneous chain, in which 
the sequence of A-T and G-C has been chosen randomly; we present only the 
graphs concerning the radial degrees of freedom r„. 
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In the homogeneous case, the configuration is symmetric at both sides of the 
bubble. Besides the obvious translational invariance (for the infinite chain), it is 
possible to have a configuration centered on one site with the largest opening, as 
in Fig. 1, or on two sites with equal and largest openings (the analogous of what 
happens also for the discrete kinks and breathers). For brevity, in the following 
the bubble centered on one site will be denoted odd bubble, and that centered 
on two sites even bubble. In the heterogeneous case, translational invariance is 
lost, but it is not difficult to guess, in view of the method described in Appendix 
A, that an equilibrium configuration occurs for any site or couple of neighbor 
sites chosen as the center of the bubble (obviously now not symmetric). 

2.2 Stability 

In order for an equilibrium configuration to be stable, the hessian matrix of the 
potential U must be positive definite at that point of configuration space. In 
that case, the square roots of twice the eigenvalues of the matrix give the proper 
frequencies of the small oscillations around the equilibrium. We will consider 
here, as an example, the results for s — —0.273 (see Eq. @)), for the cases of 
the odd and the even bubble. However, before treating our particular example, 
we want to note the following fact concerning the stability of these kind of 
configurations. We have found that, depending on the choice of the constant 
s and on the values of the model parameters, both stable and unstable cases 
occur, and often if the odd bubble is stable, then the even bubble is unstable, 
and viceversa. Then one of the smallest eigenvalues of the hessian matrix in the 
stable case is associated to the movement of the bubble along the chain; to be 
more precise, the perturbation that corresponds to the eigenvector associated 
to this small eigenvalue gives rise, once the linear regime has been passed, to 
the movement of the bubble in one direction. During the movement the bubble 
will go (in the adiabatic approximation) through the equilibrium configurations 
constituted by the odd and the even bubbles. We have here a strong similarity 
with the situation that arises with kinks j^, and an analogy with the breather 
sensitivity to movement that we mentioned in the Introduction. 

We now turn to our example with s = —0.273 and with the same parameter 
values that we employ for the simulations (we will give these values at the begin- 
ning of the next Section). We have performed our calculations on a chain with 
100 base pairs, with the bubble in the middle; this should be sufficient to avoid 
boundary effects. For the odd bubble the hessian matrix is positive definite. 
Most of the proper frequencies are associated to phononic excitations; in fact, 
the corresponding eigenvectors are spread throughout the all chain. But a small 
number of eigenvalues correspond to eigenvectors that have components which 
are not negligible only on the sites of the open region. Therefore, they represent 
perturbations of the bubble. Among these, there is the smallest eigenfrequency 
In Fig. 2 we show the eigenvector corresponding to the smallest eigen- 
value. This is clearly associated to the movement of the bubble, according to 
what we pointed out in the previous paragraph. This is proved by the fact that 
in the spectrum of the even bubble the eigenvalues are all positive except one. 
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The positive eigenvalues are very similar to the corresponding ones of the odd 
bubbles, while the negative one is very close, in absolute value, to the smallest 
of the odd bubble. The smallness of this value shows that the open region is 
very "sensitive" to movement H. 



3 RESULTS 

In this Section we show the results of the simulations performed. We have 
simulated a chain of 2500 base pairs, with fixed boundary conditions. The 
parameters of the model have been given the following values: the depths of 
the Morse potential are Da~t = 0.05 eV and Dg-c = 0.075 eV; the width is 
a = 4 A~^; the constant of the harmonic stacking interaction is c = 0.05 eV/A^, 
while that of the restoring force is K = 0.14 eV/A^; we have already given the 
values oi h — 3.4 A, i?o = 10 A and 8o — 7r/5. We have used the second 
order bilateral simplectic algorithm described in ||2^. As anticipated before, 
the travelling bubble is formed by imposing a partial unwinding at one end of 
the chain. After that, the open region travels towards the other end. Let us 
give some more details on the process by which the open region is formed. We 
have fixed boundary conditions. At the left end of the chain we begin to make 
an unwinding. This is done by decreasing the angle A^o = 9i — 9q between 
the "fixed" site at the left of the chain and the first site, i.e., by increasing ^o- 
This causes an opening of the first few sites because of the last term in the 
potential energy (|^) . During the process of formation of the open region, also 
phonons are created, which begin to travel faster than the bubble. At the end 
of the process we observe the bubble travelling towards the right. We have used 
different amplitudes for the increase of flpi that will be specified in the following 
for the different cases. 

We begin by showing the results of the simulation performed for a homoge- 
neous chain initially at rest in the fundamental equilibrium configuration. We 
have increased 9q by 1.25 radians. In Fig. 3 we present the configurations at 
6 different times. It can be noted from the graphs that the travelling bubble 
is practically stable. We have even found that, when it reaches the end of the 
chain, it bounces back. Another thing to be noted, and that is valid also in the 
other situations that we will show, is that outside the bubble the radial coordi- 
nate r„ is practically in the equilibrium position i?0: s-^d correspondingly there 
is no uncoiling. This appears to be in contrast with the results of the previous 
Section, concerning the structure of the equilibrium configurations. We remind 
that we found a degree of uncoiling, and a value of r„ somewhat larger than 
i?o, at the sides of the bubble. At the end of this Section we will try to give 
an argument to show the reason why the sides of a travelling bubble can be in 
the equilibrium configuration. This fact is probably a good point, in view of a 
possible biological significance of the dynamical processes of this model, and we 
will comment on that in the last Section. 

In Fig. 4 we show the situation that arises in a heterogeneous chain, again 
initially at rest in the fundamental state. The sequence of base pairs has been 
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chosen at random. We can note that the bubble progressively decreases its 
amplitude, contrary to what happens in the homogeneous case, and in fact in 
the last configuration practically we do not see it any more. However, before 
disappearing the excitation has travelled well beyond 1000 base pairs. Here 
we have increased 9o by a greater quantity than before, namely by 2 radians. 
It is not difficult to understand the reason of the different behavior between 
homogeneous and heterogeneous chains. In the first case the spectrum of the 
hessian matrix in the equilibrium positions for given s (see Eq. (^) is the 
same for all odd bubbles and the same for all even bubbles, indipendently of 
the location (at least for infinite chains, but for finite chains this is true to 
a high degree of accuracy, unless the bubble is very close to one end of the 
chain). Therefore, in the adiabatic approximation, the dynamical situation of a 
bubble repeats periodically every site that has been travelled. In a heterogeneous 
chain the hessian matrix is, in general, different at any location, thus the above 
argument does not apply, and a faster energy loss takes place. Nevertheless the 
lifetime of the bubble is still satisfying. It is possible to argue, in a qualitative 
way, that heterogeneity acts on the bubble only through the few sites belonging 
to its two ends, since the other sites of the bubble are in the plateau region 
of the Morse potential, where there are no differences between the two types 
of base pairs. With the exposition of the results obtained for chains at room 
temperatures, we will touch again this point. 

We have made simulations in which we have produced, with the same pro- 
cedure as before, a localized excitation; but now the chain is initially in thermal 
equilibrium at 300 K. Again, we have studied both a homogeneous and a hetero- 
geneous chain. For the second case, we have used the same base pair sequence 
that has been adopted for the simulation of the system initially at rest. Let us 
first consider the homogeneous chain. We have made a simulation in which we 
have increased Oq by the same quantity, 2 radians, used in the heterogeneous 
chain at zero temperature. In this way we could make a comparison, under the 
same initial excitation process, between the robustness of the bubble against 
heterogeneity and against this level of thermal noise. Fig. 5 shows the con- 
figurations again at 6 successive times. From the inspection of Figures 4 and 
5 we can note the following points. The amplitude of the bubble is greater in 
the heterogeneous chain initially at rest; nevertheless the distance travelled is 
somewhat greater in the homogeneous chain at 300 K. Therefore it seems that 
the interaction with the phonon bath at this temperature is less effective, in 
taking energy away from the bubble, than the modes of the (disordered) het- 
erogeneous chain. Of course, it is possible to increase the lifetime of a bubble 
by increasing the strength of the initial excitation. In Fig. 6 the configurations 
obtained for the omogeneous thcrmalizcd chain when 0q is initially increased by 
2.8 radians. We can see that the bubble has still a large amplitude when it has 
almost reached the end of the chain; as in the case of the zero temperature, we 
have found that it bounces back. 

The last case that we present is that of the heterogeneous chain at 300 K, in 
Fig. 7; the initial increase in 6*0 is 2.8 radians. We see that, in spite of the two 
possible sources of disturbances to the localized excitation, heterogeneity and 
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thermal noise, the bubble still travels for about 1300 base pairs. 



3.1 Moving open regions 

In order to show how the bubbles move along the chain, we employ here a sim- 
plified version of the model. We make this choice since in this way we can have 
manageable expressions. However, we believe that the same kind of mechanisms 
happen in the complete system, the difference being that the expressions would 
be much more involved, requiring the inversion of trigonometric functions. 

The fundamental equilibrium configuration is that with r„ — Rq and 0„ = 
nOo'i we here expand L„+i,„ (see Eq. (^) in power series and keep only the first 
order terms in (r„ — Rq), (r„+i — Rq) and (6'„ — n8o). Such a procedure is not 
entirely consistent, since we do not make an analogous expansion in the Morse 
potential. However, we have checked numerically that the linear approximation 
for Ln+i,n is not bad in a quite large range of variability of r„, Vn+i and A0„, 
and more importantly this simplification is done only for illustrative purposes. 
Calling y„ = r„ — Rq and z„ = Ro{dn — nQo), and neglecting the kinetic terms, 
the equations of motion of this simplified system in the homogeneous case are: 

ijn = 2ai?(e-2-!/"-e-"^")+c(2;„+i+y„_i-2y„) 

- A^iUn+l + Un-l +"21/71) - AB{Zn+l ~ Zn-l) (8) 
Zn = B^{Zn+i + Z„_i - 2z„) + AS(y„+i - Un-l) (9) 
where the positive coefhcients A and B, coming from the power expansion of 
Ln+i,n, are given by ^ = 2VFf^sin^^ and B = v^l^sinGo. Going to 
the continuum limit, we pose n ^ x, y ^ (p and z — > ■(/;. Taking into account 
partial spatial derivatives up to a suitable degree, and denoting with Um the 
Morse potential, we obtain the following equations: 

d'^cj) 2AB^^ ^ AB^^^ flOl 
dx'^ dx 3 dx^ 

y - ^ p + 12^ 9^ + ^"^^di + i'^^d^ 

We now pose = and = v"^ , and in the following we consider 

the expressions that are found keeping only the first order terms in u^. From 
Eq. ( ITTl ) it is possible to obtain an expression for the spatial derivatives of ip as 
a function of the one that is of interest to us is: 

I-\'^B^)-'BV^B^)^-^d^^ 

with the arbitrary constant d. Substituting in (|l0| ) for ^ and we find an 
equation for (p: 



dx"^ 



Um{^)+9[1 + ^U + 2^v 



A 



2 



,2 J,2 



(13) 
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where c' = c — ^1 — ^^nd g = —2dAB. Let us begin considering the 

static case, v'^ = 0. Then, Eq. (|l3|) reduces to: 

We see that with a positive 5 (i.e., with d < 0) we have the possibihty of a 
locahzed excitation; actually, it is not difficult to see that it must be < 5 < 
^aD. In fact, in that case V {(/)), that diverges exponentially to —00 for cj) — s- —00 
and linearly to +00 for </> +00, has a local maximum for a (small) positive 
value 4>* and a local minimum for a larger value of (j). These two values are given 
by the two solutions of the equation -^Um = 9- Then, solving the Newton- like 
equation ( p^ ) with a "total energy" 1/(0*), we have either (j) = (j)* ov (j) ^ (j)* for 
X ±00, with a localized region where (p reaches a maximum; this maximum is 
given by the value of to the right of the local minimum, where V{(f)) — V{4>*). 
In this case, at the sides of the open region we also have ^ ^ d — 2^41* < 0, 
i.e., a small uncoiling. Summarizing, in order to have a static localized solution 
a constant 0? < is necessary. 

We now go to i;^ > (although sufficiently small since we have made an 



expansion in v ). We rewrite Eq. (13) with 5 = (i.e., d = 0): 



A 



2 



d 



W{cj)) (15) 



The differences from before are that the divergence of V{4)) for (f) +00 is now 
quadratic, and, more important for our argument, the local maximum is for 
= 0. Therefore, it is possible to have travelling localized excitations, at both 
sides of which the field goes to 0, and so does the uncoiling 

Although we have used here a simplified model, it is very likely that in the 
complete system a very similar argument applies. This should explain why in 
the simulations we find, at the sides of the open region, normal twisting. 



4 DISCUSSION AND CONCLUSIONS 

In this work we have studied a model of DNA with two degrees of freedom 
per base pair. The model has been built explicitly to represent the helicoidal 
structure of DNA |]l^. We have analytically shown that, under some uncoiling, 
the system exhibits stable equilibrium configurations in which there is a small 
region, of about 20 base pairs, where the hydrogen bond between complemen- 
tary bases is completely disrupted, allowing access to the genetic code. Then, 
through MD simulations, we have found that these open regions can travel along 
the DNA chain, also when both thermal noise and heterogeneity are present. 

In connection with our results, we would like to mention what has been found 
in 1^ concerning the statistical mechanics of this model (the small differences 
in the Hamiltonian of p5[ | should not be important for this qualitative point). 
In that work the melting transition has been studied. The isothermals in the 
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plane with the thermodynamical variables corresponding to torque and uncoiling 
show clearly a first order phase transition (the computation are performed for 
an infinite chain); during the transition, in which the uncoiling increases at 
constant temperature and torque, the two coexisting phases are interpreted as 
one with normal distance between the complementary bases, and one with the 
hydrogen bonds disrupted. At the end of the transition, only the phase with 
disrupted bonds remains. It is natural to think that these two phases can be put 
in correspondence with the two possible equilibrium configurations that exist in 
a chain with a degree of uncoiling, namely that with a bubble and that without, 
taking into account that our simulations are performed at a temperature (or at 
a torque) below that required for the melting transition. 

We have noted in the previous Section that the travelling bubbles that have 
been generated in our simulations show normal coiling at the sides of the open 
region, and in correspondence the hydrogen bond between complementary bases 
is at the equilibrium distance. This suggests the possibility to have more than 
one travelling bubble at the same time. This fact resembles the situation that 
arises with kinks: only one static kink can be present (and this is easy to un- 
derstand, since the exact solution reaches the positions of the minimum of the 
potential only asymptotically), but for travelling kinks the situation is differ- 
ent Therefore, this model allows transcription to take place at the same 
moment in different portions of the chain. 

In the construction of nonlinear dynamical models of biological systems, one 
of the main properties to satisfy is robustness of the relevant processes. This 
means that changes, at least within suitable ranges, of the external conditions, or 
of the dynamics of the triggering events, or even of the parameters of the effective 
potentials, must not result in essential changes of the main features of the process 
under consideration. The topological index of kinks can not be destroyed by 
perturbations or by thermal disorder, although, of course, a kink can loose 
energy by phonon radiation [Q. Breathers are non topological objects, but some 
simulations pS] have shown that they can survive perturbations. However, the 



larger their energy, the larger their tendency to remain pinned 1 29 , 3C[| ; besides 



as we mentioned in the Introduction, if we look for a breather with a very large 
amplitude, as should be required to allow exposition of the genetic code, then 
we will not find a stable excitation. 

The model used in this work, with two degrees of freedom per base pair, has 
shown to possess the positive features of both kinks and breathers: although 
there is no topological index to prevent eventual decay of the excitation, nev- 
ertheless the "transcription bubble" is quite stable. A necessary condition for 
biological plausibility is that heterogeneity must not prevent propagation of the 
localized excitations in this class of models. We have seen that this is our case, 
also if certainly the life of the bubbles is shorter if the chain is not homogeneous. 
We have argued that this is due to the very structure of the bubble: most of 
the sites belonging to it are in the plateau region of the Morse potential, where 
there are no differences between base pairs; only the few sites at the two ends 
of the bubble experience these differences. 

We have chosen to generate the open region through an unwinding at one 
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of the ends of the chain; this should simulate the initial enzymatic action. We 
would like to say more on the spirit in which this position has been taken. 
There have been attempts to see how breathers can form spontaneously during 
the dynamics, starting from modulational instability, and then growing through 
collisions, that on the average favor the growth of the larger excitations |Q 
We did not show similar results that we have obtained with this model, 
concerning the formation of a bubble out of thermal excitation. However, this 
kind of process lacks any possibility of control about the particular group of 
sites where it begins to take place. Since it is certain that there is an enzymatic 
control on the temporal and spatial beginning of the transcription, we have 
adopted the point of view of mimicking in some way this initial action. Another 
point to be noted is related to the energetics; in the real process of transcription 
enzymes are present all the time; this is in contrast with the strategy generally 
adopted, namely the study of simple autonomous systems. However, one could 
argue that, if the autonomous system shows dynamical processes that already 
enjoy a good degree of stability, then the enzymatic dynamical action (of course 
now we are not concerned with the control activity), that should increase this 
stability and then the lifetime of the process, requires a relatively small amount 
of energy. 

At the beginning of the Introduction we pointed out that these models are 
way too simple to represent faithfully objects as complex as, in general, bio- 
logical systems, and that their use is based, implicitly, on the assumption, or 
better the hope, that their dynamical properties can reproduce those of the real 
system, at least the more important. We think that this point is strongly con- 
nected with the problem of the robustness, previously mentioned with respect 
to changes within the framework of the adopted model. In fact, as long as one 
believes to have captured the essential properties of the dynamics, one has also 
to be sure that an enrichment of the models, necessary to get closer to more 
complete descriptions, does not alter these properties. This is not a minor point: 
if the complexity of the structure of a dynamical model increases, it probably 
becomes more difficult to find a relatively ordered process as a travelling lo- 
calized excitation. We think that this is one of the problems that deserve the 
efforts to be spent in future works. 
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A Equilibrium configurations in heterogeneous 
chains 

Let us suppose that we have found an equilibrium configuration for a homoge- 
neous chain with all Z3„s equal to Da-t- We now want to find the configuration 
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for a chain in which some of the DnS are instead equal to Dq-c- We can use 
the following procedure. We have to solve Equations (||) and (||) for the given 
sequence of the D„s. We rewrite the equations in implicit form as: 



Suppose to know the solution of (|16|) and ( |17| ) for a certain sequence of the I?„s. 
If we now have Dn — > Dn + SDn, then wc can find, at the first order, the new 
solution by solving the linear system of equations: 

^Dn + Y.j-j—5x^ = Q (18) 



^ dOndXr, 



fa,„=0; n=l,...,7V (19) 



where Xm is the generic variable appearing in [/, and in the sums in m the only 
terms that will appear are those belonging to the same site or to the neighboring 
sites. In the linear system ( IT^ ) and ( |l9|) the derivatives are to be computed in 
the old equilibrium configuration, and it has to be solved with respect to the 
5xm- Although this will give the new configuration only at first order, it is 
nevertheless possible to refine the solution up to the desired degree of accuracy 
with iterative steps. If the values of and after solving the system, are 
not yet equal to within the chosen tolerance, then one can solve a system 



in which the terms with 6Dn in (18) are substituted by those values. With a 
suitable choice for the variation 5Dn it is then possible, repeating a sufficient 
number of times the procedure, to start from the solution of (|l^ and (|l^) for 
Dn = Da-t for each n and find the solution in which any subset of the D„s 
has become Dg-c- 



B Equilibrium configurations with a bubble 

To solve in general Equations (^ and (^) we start by posing: 

Ln+l,n — Lq 



Ln+l,n 



-rn+irn sin A6'„ = s (20) 



In principle from ( pO| ) one can have A0„ = /(fn+ij fni s). Substituting this func- 
tion, for A9n and AO^-i, in (|^), one can obtain an equation g^r^^i, r^, fn-i, s) = 
0. Choosing the values of two contiguous sites r„j and r^+i, the equilibrium 
configuration can be computed site by site. But, without any hint on the choice 
of the initial values for r,„ and Vm+i, it is not possible to predict the structure 
along the chain of the configuration that will be found. It would be the anal- 
ogous of computing a static solution of Eq. (|^) with "initial conditions" on 
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chosen at random: the solution will be oscillatory or will (unphysically) 
diverge for x —f +00 or x — cxj; the localized solution such that <f>{x) —> 4>± 
for X ±00, where and 0_ are two degenerate minima for U, requires 
exactly given "initial conditions" . We will show the way in which this problem 
can be solved and therefore how we can find a solution of Equations (||) and 
(||) constituting a nontopological locaHzed excitation. 

As we said in subsection posing L„+i^„ = L (constant in n) we find, 
from Eqs. (||) and (|^), an homogeneous equilibrium configuration with Vn = r 
and A0„ — A6' for given r and A0; we consider here the case L < Lq, that 
gives r > Rq and A9 < ©q. Substituting in (EOf) rn+i = Tn = r and Mn = 
together with L„_|_i_„ = L, we find the corresponding value of s. With this 
value of s we now want to find a configuration such that r„ — > r and A0„ — > A0 
for n ^ ±cxD, with an open region in the middle. Then, for n — > ±00 we write 



Tn — r + 5rn and A6'„ = A6' + 56'„, we substitute in (20) and we expand in power 
series of (5r„, and 59 n, keeping only the first order terms. Therefore we 

have a linear equation from which we obtain: 

6en^l{5rn + 5rn+i) (21) 

where the coefficient 7, that we do not write explicitly here, depends on r, A0 
and the parameters of the model. At this point we expand Eq. (||) in power 
series of 5r„+i, i5r„_i, 59n and (50„_i, we keep only the first order terms 
and we substitute 56n and S9n-i from (^). Then we obtain i5r„_|_i as a function 
of (5r„ and (5r„_i, in the form: 

Srn+i = rjSrn - (22) 

Again, the coefficient r/ depends on r, A0 and the parameters of the model. 
What is of interest here, for what will be said in a moment, is that, when 
r > Rq and A^ < Oq, it is always 77 > 2. If we now pose: 

S^rn = ( ,f " ) (23) 



then from (p^), we can derive the matrix equation 



1 

The eigenvalues of the matrix in (|24|) are given by 



Srn+i = ( ? ] <5r„ (24) 



77: 



± _4 (25) 



The eigenvalues are both real and positive since r] > 2, and A_ = the matrix 
determinant being 1. Then A+ > 1 and A_ < 1. We are therefore assured that, 
if for a given hq we take 5r„„ as the eigenvector corresponding to A+ , then for 
n < no 6rn will tend exponentially to 0. Therefore, the strategy is to take such 
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a (5r„Q , obviously of very small modulus to make the linear approximation in 
( pO| ) and (^) valid, and then to compute A6'„ and r„ for n > uq site by site from 
the complete equations and (|^). One will reach a maximum value of r„ for 
some ni, and for n > ni r„ will decrease; only in the cases where r„j_i = rn-^+i 
or r„j = r„j+i a good localized solution, with r„ ^ r for n ^ +00, will be 
obtained. In the first case we have an odd bubble, and in the second an even 
bubble. To fall in one of these two cases, it is sufficient to perform some tries 
adjusting the modulus of the initial vector Srng- 

In this way, we have found that in a somewhat uncoiled chain (A0„ < Oq) 
there is an equilibrium configuration with r„ — > r > i?o and A0„ —^ A9 < Oq 
for n ±00, where in r the hydrogen bond represented by the Morse potential 
is only slightly stretched, and in the middle r„ is such that the hydrogen bond 
is completely broken (see Fig. 1). 

As already explained, the qualitative picture is not changed in heterogeneous 
chains. 
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Figure 1: Equilibrium configurations for a homogeneous chain with Z)„ — Da-t 
for each n (a), and for a heterogeneous chain with a random choice of £)„. We 
show only the central region, that with the bubble. In this and in the figures 
from 3 to 7 the unit of length is A. 
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Figure 2: Eigenvector of the smallest eigenvalue of the hessian matrix of a 
homogeneous chain with 100 base pairs. The full line corresponds to the radial 
displacement r, while the dashed line to the transversal displacement i?o^- In 
the vertical axis we have arbitrary units. 
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Figure 3: Successive configurations in a homogeneous chain (all D^s equal to 
Da-t) of 2500 base pairs initially at rest in the fundamental equilibrium config- 
uration. In the vertical axis we have Ar = r — Rq- The bubble travels towards 
the right. It has been formed by an unwinding given by an increase of 1.25 
radians in ^o- In order to show all the configurations in a single graph, in the 
vertical position of each one there is an offset of 1 with respect to the previous. 
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Figure 4: Same as Fig. 3, but for a heterogeneous chain, with random choice of 
the DnS, and where has been initially increased by 2 radians. 



22 



Ar 




5 



3 ^Mk#iiA«J#^^ 
2 ifW^Wfi^^ 



1 



1 



i#H#i*< 








500 



1000 1500 

n 



2000 2500 



Figure 5: Same as Fig. 3, but for a chain in thermal equilibrium at 300 K, and 
where 6o has been initially increased by 2 radians. 



23 



Ar 



8 
7 
6 
5 
4 
3 
2 
1 

-1 








500 



1000 1500 

n 



2000 2500 



Figure 6: Same as Fig. 5, but with initially increased by 2.8 radians. 
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Figure 7: Successive configurations in the same heterogeneous chain of Fig. 
4, but in thermal equilibrium at 300 K and with initially increased by 2.8 
radians. 
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